The will to end one’s life is commonly believed to stem from factors such as financial stability, health issues, and physical environment. Multiple data sets available on the internet report the number of suicide cases for various prominent cities around the globe, together with factors such as GDP per capita and life expectancy. These observations give us an objective basis for exploring these intuitive relations.
The happiness index measures contentment among the residents of the world's countries. The happiness scores are calculated from a population survey and are expected to be a function of multiple parameters such as average life expectancy, family income, and family size. Different agencies model this index differently, but it heuristically represents the general contentment with life that people feel in places around the world.
With an intuitively expected relationship between suicides and happiness, this report aims to check this intuition against objective metrics and real-world observations. We also aim to compare the trends between the Urban Bliss Index, calculated from objective parameters, and the World Happiness Index, curated from extensive surveys.
We primarily use three data sets for all our analysis. Namely:
(Source: Kaggle Datasets)
For various continents, let’s analyse the impact of GDP per capita on the suicide rate of each country. We do this to test our intuition that financial stability in a region is directly correlated with the number of suicides in that region.
Result: We observe that the GDP per capita has no correlation with the number of suicides per capita.
Remark: Though financial stability at an individual level cannot be measured simply by the per-capita income of a region, per-capita income still gives us a large-scale understanding of the financial state of that region's residents. With a large enough data set, we can therefore hope to gauge the general effect of income on suicide rates.
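A per-continent correlation of this kind can be sketched as follows. This is only an illustration on randomly generated toy data: the data frame `toy` and the column names `continent`, `gdp_per_capita` and `suicides_per_100k` are assumptions, not the names used in the Kaggle files.

```r
# Toy data only: column names and values are hypothetical, not from the Kaggle files.
set.seed(1)
toy <- data.frame(
  continent = rep(c("Europe", "Asia"), each = 20),
  gdp_per_capita = runif(40, min = 1000, max = 60000),
  suicides_per_100k = runif(40, min = 0, max = 30)
)

# Pearson correlation between GDP per capita and suicide rate,
# computed separately for each continent.
cor_by_continent <- sapply(
  split(toy, toy$continent),
  function(d) cor(d$gdp_per_capita, d$suicides_per_100k)
)
print(cor_by_continent)
```

Values near 0 for every continent would be in line with the lack of trend reported above, while values near 1 or -1 would indicate a strong linear relationship.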
Let’s observe if the Happiness Index has some connection with financial stability or whether it shows a similar lack of trend as suicide rates.
Following the same approach as before, we compare GDP per capita against the happiness index and obtain the following plot.
We observe that the happiness index is strongly correlated with the GDP per capita as opposed to the trend observed in the suicide rates. We can further substantiate the result by testing this on various continents:
To understand the impact of health conditions and life expectancy on suicide rates, we first establish that GDP per capita and life expectancy are highly correlated.
To further substantiate the strong correlation between the Health Index and GDP per capita, we can look at the correlation between the fitted values and the observed values.
cor(dat.happy$Health..Life.Expectancy.,fitted(lm(dat.happy$Health..Life.Expectancy.~dat.happy$Economy..GDP.per.Capita.))) # multiple correlation
## [1] 0.7816253
As expected, the correlation is high (about 0.78).
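For a simple linear regression with one predictor and an intercept, the correlation between observed and fitted values equals the absolute value of the ordinary Pearson correlation, and also the square root of R². The sketch below checks this on randomly generated toy data; the variables `x` and `y` are illustrative stand-ins, not the columns of `dat.happy`.

```r
# Toy data only: x and y are synthetic stand-ins for the two indices.
set.seed(42)
x <- rnorm(100)
y <- 0.8 * x + rnorm(100, sd = 0.5)

fit <- lm(y ~ x)

multiple_r <- cor(y, fitted(fit))       # correlation of observed vs fitted values
pearson_r  <- cor(x, y)                 # plain Pearson correlation
r_from_r2  <- sqrt(summary(fit)$r.squared)

# All three quantities agree up to floating-point error.
print(all.equal(multiple_r, abs(pearson_r)))
print(all.equal(multiple_r, r_from_r2))
```

The multiple correlation therefore adds no information beyond `cor()` on the two columns in the one-predictor case; it becomes a distinct quantity only when the model has several predictors.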
This loosely establishes that financially well-off people tend to have fewer health issues and consequently higher life expectancy. We do not, however, observe such a trend between suicide rates and per-capita income.
If we look at the distribution of happiness across various continents, we come across the following plot:
489 * GDP + 1.456 * Life Expectancy
This leads us to believe that several regional and environmental factors might also change the happiness index and, correspondingly, the suicide rates of a particular region.
## [1] "Difference is Mean of happiness index in Upper Hemisphere to Lower Hemisphere : 0.402737286528037"
## [1] "Deviation in the mean differene : 0.0837894696952222"
## [1] "Hypothesis : Mean of Upper Hemisphere <= Mean of Bottom Hemisphere"
## [1] "P value : 7.67830773618501e-07"
Since the p-value is very low, the null hypothesis is rejected. The mean happiness index of the Upper Hemisphere is therefore very likely higher than that of the Lower Hemisphere.
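The figures reported above can be related to each other with a short calculation. The sketch below assumes the test is a one-sided z-test on the difference of means, with the reported "deviation" used as the standard error; the report does not state which test was run, so this is an assumption, and the variable names are illustrative.

```r
# Values copied from the output above.
mean_diff <- 0.402737286528037   # upper minus lower hemisphere mean
std_error <- 0.0837894696952222  # reported deviation of that difference

# Assumption: the difference of means is approximately normal,
# so a one-sided z-test applies.
z_score <- mean_diff / std_error

# P(Z > z) under the null hypothesis "upper mean <= lower mean".
p_value <- pnorm(z_score, lower.tail = FALSE)

print(c(z = z_score, p = p_value))
```

A z-score of this size lies far in the upper tail of the standard normal distribution, which is why the resulting p-value is so small.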
The question that we want to explore in this section is, “Are people in some age groups more likely to commit suicide than people in others?”
If we look at the mean values for all the different age groups we can derive the following conclusion:
The likelihood of suicide increases with age.
We can also try and analyse how prone each generation is to committing suicides.
We can then check whether the happiness scores around the globe follow a similar pattern across the same generations.
Silent Generation – 1928-1945
Boomers – 1946-1964
Generation X – 1965-1980
Millennials – 1981-1996
Generation Z – 1997-2012
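The birth-year ranges listed above can be turned into a lookup with `cut()`. This is a minimal sketch: the helper `generation_of()` and its `birth_year` argument are hypothetical names, and it assumes a birth year is available for each record.

```r
# Hypothetical helper: map a birth year to the generation labels listed above.
generation_of <- function(birth_year) {
  cut(
    birth_year,
    # Right-closed intervals: (1927,1945], (1945,1964], (1964,1980], ...
    breaks = c(1927, 1945, 1964, 1980, 1996, 2012),
    labels = c("Silent Generation", "Boomers", "Generation X",
               "Millennials", "Generation Z")
  )
}

# Years outside 1928-2012 map to NA.
print(generation_of(c(1930, 1946, 1980, 1981, 2012, 2020)))
```

Because `cut()` uses right-closed intervals by default, each break is the last year of the preceding generation, so boundary years such as 1945 and 1946 fall on the intended side.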
We can extrapolate the Happiness Index by using the linear model that we derived in the earlier section to check whether a generation more likely to commit suicide was less happy on an average.
We observe that even with this form of clustering, the hypothesis does not hold.
With the following plots we aim to understand the growth or dip in suicide rates for various continents over time. We do this to try and arrive at a possible correlation between suicide rates, time and region.
In Europe, Australia and Africa, the suicide rate decreases after 2000, while in North America and South America there is a slight increase. In Asia, it fluctuates considerably.
Overall we still aren’t able to correlate suicides with any parameter.
With that motivation set, we can look at our final merged data-set that helps us perform cross analysis of all of our parameters.